Estimating false discovery rates for contingency tables
Abstract
When testing a large number of hypotheses, it can be helpful to estimate or control the false discovery rate (FDR), the expected proportion of tests called significant that are truly null. The FDR is closely linked to the probability that a truly null test is called significant, and thus a number of methods have been described that estimate or control the FDR by directly using the p-values of the hypothesis tests. Most of these methods assume that the p-values are uniformly and continuously distributed under the null hypothesis, an assumption that often does not hold for finite data. In this paper, we consider the estimation of the FDR for contingency tables. We show how Fisher’s exact test can be extended to efficiently calculate the exact null distribution over a set of contingency tables. Using this exact null distribution, we explore the estimation of each of the terms in the FDR estimator, characterize the asymptotic convergence of the estimator, and show how the conservative bias can be reduced by removing certain tests from consideration. The resulting estimator has substantially less conservative bias than traditional approaches.
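As a rough illustration of why the uniform-p-value assumption fails for contingency tables, the following minimal sketch enumerates the exact null distribution of the two-sided Fisher p-value for a 2x2 table with fixed margins and plugs it into a simple FDR estimate. It is not the paper's estimator: the function names (fisher_table, fisher_p, null_cdf, estimate_fdr) are hypothetical, every test is treated as null (no estimate of the null proportion), and no tests are removed from consideration.

```python
from math import comb

def fisher_table(row1, col1, n):
    """Map each attainable top-left count a to (two-sided p-value, null probability)
    for a 2x2 table with first-row total row1, first-column total col1, grand total n."""
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    denom = comb(n, row1)
    # Hypergeometric probability of each attainable table under independence.
    pmf = {a: comb(col1, a) * comb(n - col1, row1 - a) / denom
           for a in range(lo, hi + 1)}
    out = {}
    for a, prob in pmf.items():
        # Two-sided p-value: total null mass of tables no more likely than this one.
        # The small relative tolerance guards against floating-point ties.
        p = sum(q for q in pmf.values() if q <= prob * (1 + 1e-9))
        out[a] = (min(1.0, p), prob)
    return out

def fisher_p(a, row1, col1, n):
    """Two-sided Fisher exact p-value for the observed top-left count a."""
    return fisher_table(row1, col1, n)[a][0]

def null_cdf(row1, col1, n, t):
    """Exact P(p-value <= t) under the null for a table with these margins."""
    return sum(prob for p, prob in fisher_table(row1, col1, n).values() if p <= t)

def estimate_fdr(tables, t):
    """Simplified FDR estimate at threshold t (assumption: all tests are null).

    tables is a list of (a, row1, col1, n) tuples. The numerator is the expected
    number of null tests with p <= t, computed from each table's exact null
    distribution rather than from the uniform approximation m * t.
    """
    expected_null = sum(null_cdf(r, c, n, t) for _, r, c, n in tables)
    rejected = sum(1 for a, r, c, n in tables if fisher_p(a, r, c, n) <= t)
    return min(1.0, expected_null / max(rejected, 1))

if __name__ == "__main__":
    # Two toy tables, given as (observed top-left count, row total, column total, n).
    tables = [(3, 3, 3, 6), (5, 5, 5, 10)]
    print(estimate_fdr(tables, 0.05))
```

With the two toy tables in the usage example, the first table cannot reach a p-value below 0.1 for any cell configuration, so its exact null probability of being called significant at 0.05 is zero, whereas a uniform approximation would charge it the full 0.05. For this toy input the exact-null estimate works out to about 0.008, against 0.1 under the uniform approximation (2 tests times 0.05, divided by 1 rejection).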
Related articles
Multiple Testing in Large-Scale Contingency Tables: Inferring Pair-Wise Amino Acid Patterns in β-Sheets
One of the most common test procedures using two-way contingency tables is a test of independence between two categorizations. Current significance tests such as χ² tests or likelihood ratio tests assess overall independence but provide limited information about the nature of the association in the contingency tables. This study examines the feasibility of using multiple testing procedures for an...
Cell Bounds in Two-Way Contingency Tables Based on Conditional Frequencies
Statistical methods for disclosure limitation (or control) have seen coupling of tools from statistical methodologies and operations research. For the summary and release of data in the form of a contingency table some methods have focused on evaluation of bounds on cell entries in k-way tables given the sets of marginal totals, with less focus on evaluation of disclosure risk given other summa...
A Stochastic Process Approach to False Discovery Rates
This paper extends the theory of false discovery rates (FDR) pioneered by Benjamini and Hochberg (1995). We develop a framework in which the False Discovery Proportion (FDP) – the number of false rejections divided by the number of rejections – is treated as a stochastic process. After obtaining the limiting distribution of the process, we demonstrate the validity of a class of procedures for ...
An Odds Ratio Based Inference Engine
Expert systems applications that involve uncertain inference can be represented by a multidimensional contingency table. These tables offer a general approach to inferring with uncertain evidence, because they can embody any form of association between any number of pieces of evidence and conclusions. (Simpler models may be required, however, if the number of pieces of evidence bearing on a con...
TEAM: efficient two-locus epistasis tests in human genome-wide association study
As a promising tool for identifying genetic markers underlying phenotypic differences, genome-wide association study (GWAS) has been extensively investigated in recent years. In GWAS, detecting epistasis (or gene-gene interaction) is preferable over single locus study since many diseases are known to be complex traits. A brute force search is infeasible for epistasis detection in the genome-wid...
Publication date: 2009